Squibs and Discussions: Estimation of Probabilistic Context-Free Grammars
نویسندگان
چکیده
The assignment of probabilities to the productions of a context-free grammar may generate an improper distribution: the probability of all finite parse trees is less than one. The condition for proper assignment is rather subtle. Production probabilities can be estimated from parsed or unparsed sentences, and the question arises as to whether or not an estimated system is automatically proper. We show here that estimated production probabilities always yield proper distributions.
منابع مشابه
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
متن کاملSquibs and Discussions Nonminimal Derivations in Unification-based Parsing
Shieber’s abstract parsing algorithm (Shieber 1992) for unification grammars is an extension of Earley’s algorithm (Earley 1970) for context-free grammars to feature structures. In this paper, we show that, under certain conditions, Shieber’s algorithm produces what we call a nonminimal derivation: a parse tree which contains additional features that are not in the licensing productions. While ...
متن کاملEstimation of Consistent Probabilistic Context-free Grammars
We consider several empirical estimators for probabilistic context-free grammars, and show that the estimated grammars have the so-called consistency property, under the most general conditions. Our estimators include the widely applied expectation maximization method, used to estimate probabilistic context-free grammars on the basis of unannotated corpora. This solves a problem left open in th...
متن کاملA Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars
The paper gives a brief review of the expectation-maximization algorithm (Dempster, Laird, and Rubin 1977) in the comprehensible framework of discrete mathematics. In Section 2, two prominent estimation methods, the relative-frequency estimation and the maximum-likelihood estimation are presented. Section 3 is dedicated to the expectation-maximization algorithm and a simpler variant, the genera...
متن کاملProbabilistic Unification Grammars
Recent research has shown that unification grammars can be adapted to incorporate statistical information, thus preserving the processing benefits of stochastic context-free grammars while offering an efficient mechanism for handling dependencies. While complexity studies show that a probabilistic unification grammar achieves an appropriately lower entropy estimate than an equivalent PCFG, the ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2002